[proof-of-principle] allow metadata to be fetched from an API #1207

Draft
wants to merge 1 commit into
base: master

jameshadfield
Member

@jameshadfield jameshadfield commented Sep 15, 2020

Nextstrain (Auspice) currently requires all data to be within the main JSON (or sidecar JSONs such as frequencies). This has a number of benefits, but it makes certain use cases hard.

One such case, which I've encountered periodically and more regularly with COVID, is when metadata that is not integral to the phylogenetic tree (e.g. some epi metadata) is updated, leaving the Nextstrain dataset out of date. The current solution is to rerun augur (often augur export is all that's needed if the intermediate files have been stored). However, the groups maintaining the metadata are often separate from those running the bioinformatics, so an extra communication step is required and there's a period where things are out of sync. This seems especially true for epi data, which are often subject to updates & amendments as the situation unfolds.

What if we could define this metadata somewhere else, like a Google Sheet?

Here I've implemented a proof-of-principle to explore the feasibility of storing this metadata outside of the auspice JSON and instead having Auspice fetch it via an API.


The dataset viewable at https://auspice-fetch-metadata-3t0frfv.herokuapp.com/zika-tutorial-metadata-via-api is the zika-tutorial but without the region defined in the tree in the JSON. The coloring metadata looks like:

      { "key": "region",
        "title": "Region (via API request)",
        "EXPERIMENTAL_google_sheets_id": "10x5-h2_zpjMWoW-m4SAY69KVKdi4AXX0Tw1n-aTjons",
        "type": "categorical"
      },

The actual region metadata is stored at the publicly accessible google sheet https://docs.google.com/spreadsheets/d/10x5-h2_zpjMWoW-m4SAY69KVKdi4AXX0Tw1n-aTjons/edit#gid=0.
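To make the shape of that coloring entry concrete, here is a minimal sketch (in Python rather than Auspice's JavaScript) of how a client could pick out the colorings whose values live outside the JSON. The helper names are hypothetical, the "country" entry is only illustrative, and the CSV export URL is an assumption about one way to read a publicly shared sheet; it is not necessarily the Google API this PR uses.

```python
from typing import Dict, List

# Property name taken from the coloring entry shown above.
SHEET_KEY = "EXPERIMENTAL_google_sheets_id"


def external_colorings(colorings: List[dict]) -> Dict[str, str]:
    """Map coloring key -> Google Sheet ID for colorings stored outside the JSON."""
    return {c["key"]: c[SHEET_KEY] for c in colorings if SHEET_KEY in c}


def sheet_csv_url(sheet_id: str) -> str:
    """Build a CSV export URL for a publicly shared sheet.

    Assumption: this is one possible way to read the sheet, not the
    endpoint used by the proof-of-principle.
    """
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"


# Illustrative colorings list: one ordinary coloring, one fetched via API.
colorings = [
    {"key": "country", "title": "Country", "type": "categorical"},
    {
        "key": "region",
        "title": "Region (via API request)",
        "EXPERIMENTAL_google_sheets_id": "10x5-h2_zpjMWoW-m4SAY69KVKdi4AXX0Tw1n-aTjons",
        "type": "categorical",
    },
]

for key, sheet_id in external_colorings(colorings).items():
    print(key, sheet_csv_url(sheet_id))
```

Colorings without the experimental property are left untouched, so existing datasets would behave exactly as before.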


When you change the color-by to Region, the data is fetched from that Google Sheet and then displayed by Auspice. This decouples the storage of certain bits of metadata from the JSONs.
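The fetch-then-display step could look roughly like the sketch below, again in Python for illustration. It assumes the sheet has been retrieved as CSV text with columns named `strain` and `region` (the real sheet layout may differ) and that tree nodes follow the Auspice v2 convention of `name`, `children`, and `node_attrs[key] = {"value": ...}`. The function names are hypothetical.

```python
import csv
import io


def parse_sheet(csv_text, name_col="strain", value_col="region"):
    """Turn CSV text into a {strain name: value} dict, skipping empty values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[name_col]: row[value_col] for row in reader if row.get(value_col)}


def apply_values(node, key, values):
    """Recursively set node_attrs[key] on every node named in `values`.

    Returns the number of nodes updated.
    """
    updated = 0
    if node.get("name") in values:
        node.setdefault("node_attrs", {})[key] = {"value": values[node["name"]]}
        updated += 1
    for child in node.get("children", []):
        updated += apply_values(child, key, values)
    return updated


# Toy data standing in for the fetched sheet and the tree from the JSON.
csv_text = "strain,region\nA,South America\nB,Oceania\n"
tree = {"name": "NODE_1", "children": [{"name": "A"}, {"name": "B"}, {"name": "C"}]}

count = apply_values(tree, "region", parse_sheet(csv_text))
print(count, tree["children"][0]["node_attrs"])
```

Tips that are absent from the sheet (here `C`) simply keep no value for the coloring, which mirrors how missing metadata is usually handled.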

If you have access to the Nextstrain Google Drive, you can modify a value in that sheet, refresh Auspice, and see the updated values! (Note: currently you have to explicitly change the color-by dropdown to region to get this functionality, but that's easily fixable.)

Notes

  • The current implementation is proof-of-principle. It only works for colorings stored in publicly accessible Google Sheets, and it uses a Google API that may be discontinued in a few weeks.
  • The current implementation doesn't work on page load: you have to explicitly select "region" from the color-by dropdown.
  • This PR is intended to promote discussion about this functionality / direction
  • A common use case will be metadata requiring authentication. I'm not sure how to approach this.
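One possible approach to the page-load limitation is to decide, before the first render, whether the initially selected coloring needs an external fetch. The sketch below assumes the color-by is carried in the URL query as `c=...` and falls back to `display_defaults.color_by` in the dataset meta; the function name is hypothetical and this is not what the PR currently does.

```python
from urllib.parse import parse_qs

SHEET_KEY = "EXPERIMENTAL_google_sheets_id"


def sheet_to_prefetch(meta, query_string):
    """Return the sheet ID to fetch before first render, or None.

    The selected coloring is taken from the URL query (`c=`), falling back
    to the dataset's default color-by.
    """
    query = parse_qs(query_string.lstrip("?"))
    selected = query.get("c", [None])[0] or meta.get("display_defaults", {}).get("color_by")
    for coloring in meta.get("colorings", []):
        if coloring["key"] == selected and SHEET_KEY in coloring:
            return coloring[SHEET_KEY]
    return None


# Toy meta block; "abc" is a placeholder sheet ID.
meta = {
    "colorings": [
        {"key": "country", "type": "categorical"},
        {"key": "region", "type": "categorical", SHEET_KEY: "abc"},
    ],
    "display_defaults": {"color_by": "country"},
}

print(sheet_to_prefetch(meta, "?c=region"))
print(sheet_to_prefetch(meta, ""))
```

Authentication is deliberately left out of this sketch; it only covers the question of when to trigger the fetch.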

@jameshadfield jameshadfield temporarily deployed to auspice-fetch-metadata-3t0frfv September 15, 2020 07:14 Inactive
@emmahodcroft
Member

I really like this idea!! I have also run into similar issues and it definitely gets old fast. Robert Dyrdak and I manually curated loads of age & gender data for EVD68 and having something like this (not to mention we could both have been editing it) would have been very nice!

@jameshadfield jameshadfield temporarily deployed to auspice-fetch-metadata-gk76gbz November 18, 2020 09:28 Inactive
@jameshadfield jameshadfield added the experiment PRs which may never be merged label Jun 13, 2023
@jameshadfield jameshadfield marked this pull request as draft June 13, 2023 20:10